Max-Margin Tensor Neural Network for Chinese Word Segmentation
Authors
Abstract
Recently, neural network models for natural language processing tasks have attracted increasing attention for their ability to alleviate the burden of manual feature engineering. In this paper, we propose a novel neural network model for Chinese word segmentation called the Max-Margin Tensor Neural Network (MMTNN). By exploiting tag embeddings and tensor-based transformation, MMTNN can model complicated interactions between tags and context characters. Furthermore, a new tensor factorization approach is proposed to speed up the model and avoid overfitting. Experiments on the benchmark dataset show that our model achieves better performance than previous neural network models and attains competitive results with minimal feature engineering. Although Chinese word segmentation is a specific case, MMTNN can be easily generalized and applied to other sequence labeling tasks.
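The abstract's key efficiency idea, factorizing the tensor in a tensor-based transformation layer, can be illustrated with a minimal numpy sketch. All dimensions, variable names, and the rank are hypothetical choices for illustration, not values from the paper; the layer form (a quadratic tensor term plus a linear term) follows the general tensor-layer pattern the abstract alludes to.

```python
import numpy as np

# Hypothetical sizes for illustration only (not from the paper):
d_in, d_out, r = 50, 100, 4  # input dim, output dim, factorization rank

rng = np.random.default_rng(0)
x = rng.standard_normal(d_in)  # e.g. concatenated character/tag embeddings

# Full tensor layer: each of the d_out output units owns a d_in x d_in
# interaction matrix T[o], so h[o] = tanh(x^T T[o] x + W[o] x + b[o]).
T = rng.standard_normal((d_out, d_in, d_in))
W = rng.standard_normal((d_out, d_in))
b = rng.standard_normal(d_out)
h_full = np.tanh(np.einsum('i,oij,j->o', x, T, x) + W @ x + b)

# Factorized variant: approximate each slice T[o] by a rank-r product
# P[o] @ Q[o], cutting parameters from d_out*d_in^2 to 2*d_out*r*d_in
# (here 250,000 -> 40,000), which speeds up the model and reduces
# its capacity to overfit.
P = rng.standard_normal((d_out, d_in, r))
Q = rng.standard_normal((d_out, r, d_in))
quad = np.einsum('i,oir,orj,j->o', x, P, Q, x)  # x^T (P[o] Q[o]) x per unit
h_fact = np.tanh(quad + W @ x + b)

print(h_fact.shape)  # (100,)
```

The factorized path never materializes the full d_in x d_in slices, so both memory and the per-example multiply count scale linearly in the rank rather than quadratically in the input dimension.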
Similar resources
Reasoning Over Relations Based on Chinese Knowledge Bases
Knowledge bases are useful resources for many applications, but reasoning about new relationships between new entities based on them is difficult because they often lack knowledge of new relations and entities. In this paper, we introduce the novel Neural Tensor Network (NTN) [1] model to reason about new facts based on Chinese knowledge bases. We represent entities as an average of their constituting wo...
Long Short-Term Memory Neural Networks for Chinese Word Segmentation
Currently, most state-of-the-art methods for Chinese word segmentation are based on supervised learning, whose features are mostly extracted from a local context. These methods cannot utilize long-distance information, which is also crucial for word segmentation. In this paper, we propose a novel neural network model for Chinese word segmentation, which adopts the long short-term memory (LST...
Dependency-based Gated Recursive Neural Network for Chinese Word Segmentation
Recently, many neural network models have been applied to Chinese word segmentation. However, such models focus more on collecting local information, while long-distance dependencies are not well learned. To integrate local features with long-distance dependencies, we propose a dependency-based gated recursive neural network. Local features are first collected by bi-directional long short-term m...
Long Short-Term Memory for Japanese Word Segmentation
This study presents a Long Short-Term Memory (LSTM) neural network approach to Japanese word segmentation (JWS). Previous studies on Chinese word segmentation (CWS) succeeded in using recurrent neural networks such as LSTM and gated recurrent units (GRU). However, in contrast to Chinese, Japanese includes several character types, such as hiragana, katakana, and kanji, that produce orthographic ...
Training Global Linear Models for Chinese Word Segmentation
This paper examines how one can obtain state-of-the-art Chinese word segmentation using global linear models. We provide experimental comparisons that give a detailed road map for obtaining state-of-the-art accuracy on various datasets. In particular, we compare the use of reranking with full beam search; we compare various methods for learning weights for features that are full-sentence featur...